If volume is delinquent, switch owner of volume/engine/replica to share manager CR's owner (backport #3004) #3005

mergify · 2024-07-24T22:52:26Z

Without the fix, the volume/engine/replica keep waiting for the share manager pod to be scheduled (i.e., pod.Spec.NodeName is non-empty) before setting their ownerID to that pod's node. However, because we don't want to use pod informers, the volume/engine/replica controllers might not catch the event when the share manager pod is scheduled, and so they continue to wait. This introduces a delay of up to 30s and inconsistent behavior.

Also, the > 30s delay in share manager pod recreation defeats the original goal of RWX fast failover.
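The ownership decision described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `ShareManager` and `Pod` types, the `ownerBeforeFix` and `ownerAfterFix` functions, and the node names are all hypothetical stand-ins, and the fallback to the pod's node for non-delinquent volumes is an assumption.

```go
package main

import "fmt"

// ShareManager is a hypothetical, trimmed-down stand-in for the share manager CR.
type ShareManager struct {
	OwnerID string // node that currently owns the share manager CR
}

// Pod is a hypothetical stand-in for the share manager pod.
type Pod struct {
	NodeName string // mirrors pod.Spec.NodeName; empty until the pod is scheduled
}

// ownerBeforeFix models the old behavior: ownership only moves once the
// share manager pod has been scheduled to a node.
func ownerBeforeFix(current string, pod *Pod) string {
	if pod == nil || pod.NodeName == "" {
		return current // keep waiting for the pod to be scheduled
	}
	return pod.NodeName
}

// ownerAfterFix models the new behavior: a delinquent volume follows the
// share manager CR's owner right away, without waiting for the pod.
// Falling back to the old rule otherwise is an assumption of this sketch.
func ownerAfterFix(current string, delinquent bool, sm *ShareManager, pod *Pod) string {
	if delinquent && sm != nil && sm.OwnerID != "" {
		return sm.OwnerID
	}
	return ownerBeforeFix(current, pod)
}

func main() {
	sm := &ShareManager{OwnerID: "node-2"}
	unscheduled := &Pod{} // the new share manager pod has no node yet

	fmt.Println(ownerBeforeFix("node-1", unscheduled))           // node-1
	fmt.Println(ownerAfterFix("node-1", true, sm, unscheduled)) // node-2
}
```

In this sketch the delinquent volume switches owner immediately, so no pod scheduling event needs to be observed.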

longhorn/longhorn#6205

Some testing results:

Before the fix, the new share manager pod took 15s to 70s to become running after the node of the old share manager pod was shut down.
After the fix, the new share manager pod takes 15s to 17s to become running after the node of the old share manager pod is shut down.

This is an automatic backport of pull request #3004 done by [Mergify](https://mergify.com).

manager CR's owner

Without the fix, the volume/engine/replica keep waiting for the share manager pod to be scheduled (i.e., pod.Spec.NodeName is non-empty) before setting their ownerID to that pod's node. However, because we don't want to use pod informers, the volume/engine/replica controllers might not catch the event when the share manager pod is scheduled, and so they continue to wait. This introduces a delay of up to 30s and inconsistent behavior.

Also, the > 30s delay in share manager pod recreation defeats the original goal of RWX fast failover.

longhorn-6205

Signed-off-by: Phan Le <phan.le@suse.com>
(cherry picked from commit 18ac5dc)

mergify bot mentioned this pull request Jul 24, 2024

If volume is delinquent, switch owner of volume/engine/replica to share manager CR's owner #3004

Merged

PhanLe1010 approved these changes Jul 24, 2024

PhanLe1010 merged commit 590fac1 into v1.7.x Jul 24, 2024
6 checks passed

PhanLe1010 deleted the mergify/bp/v1.7.x/pr-3004 branch July 24, 2024 23:14
